Novel Decompositions of Proper Scoring Rules for Classification: Score Adjustment as Precursor to Calibration
Authors
Abstract
There are several reasons to evaluate a multi-class classifier on measures other than error rate alone. Perhaps most importantly, there can be uncertainty about the exact context of classifier deployment, requiring the classifier to perform well across a variety of contexts. This is commonly achieved by creating a scoring classifier that outputs posterior class probability estimates. Proper scoring rules are loss measures for scoring classifiers which are minimised at the true posterior probabilities. The well-known decomposition of proper scoring rules into calibration loss and refinement loss has facilitated the development of methods to reduce these losses, thus leading to better classifiers. We propose multiple novel decompositions, including one with four terms: adjustment loss, post-adjustment calibration loss, grouping loss and irreducible loss. The separation of adjustment loss from calibration loss requires extra assumptions, which we prove to be satisfied for the most frequently used proper scoring rules: Brier score and log-loss. We propose algorithms to perform adjustment as a simpler alternative to calibration.
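The calibration/refinement decomposition mentioned above can be illustrated empirically. The sketch below (an illustration, not the paper's own code) groups examples by their predicted probability and splits the Brier score into a calibration term, measuring how far each group's prediction sits from its empirical positive rate, and a refinement term, measuring the residual label variance within groups; the split is exact when every example in a group shares the same predicted value.

```python
import numpy as np

def brier_decomposition(p, y):
    """Split the empirical Brier score into calibration loss and
    refinement loss by grouping examples with equal predictions."""
    p, y = np.asarray(p, float), np.asarray(y, float)
    n = len(p)
    brier = np.mean((p - y) ** 2)
    calibration = refinement = 0.0
    for v in np.unique(p):
        mask = (p == v)
        w = mask.sum() / n        # fraction of examples in this group
        ybar = y[mask].mean()     # empirical positive rate in group
        calibration += w * (v - ybar) ** 2
        refinement += w * ybar * (1 - ybar)
    return brier, calibration, refinement

# hypothetical predictions and binary labels for illustration
p = [0.2, 0.2, 0.8, 0.8, 0.8]
y = [0, 1, 1, 1, 0]
bs, cal, ref = brier_decomposition(p, y)
# bs == cal + ref up to floating-point error
```

A perfectly calibrated classifier drives the first term to zero; the refinement term can only be reduced by making the groups purer, which is where the paper's grouping and irreducible losses refine the picture further.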
Similar Resources
Loss Functions for Binary Class Probability Estimation and Classification: Structure and Applications
What are the natural loss functions or fitting criteria for binary class probability estimation? This question has a simple answer: so-called “proper scoring rules”, that is, functions that score probability estimates in view of data in a Fisher-consistent manner. Proper scoring rules comprise most loss functions currently in use: log-loss, squared error loss, boosting loss, and as limiting cas...
Likelihood-ratio calibration using prior-weighted proper scoring rules
Prior-weighted logistic regression has become a standard tool for calibration in speaker recognition. Logistic regression is the optimization of the expected value of the logarithmic scoring rule. We generalize this via a parametric family of proper scoring rules. Our theoretical analysis shows how different members of this family induce different relative weightings over a spectrum of applicat...
Strictly Proper Scoring Rules, Prediction, and Estimation
Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the forecast and on the event or value that materializes. A scoring rule is strictly proper if the forecaster maximizes the expected score for an observation drawn from the distribution F if she issues the probabilistic forecast F, rather than any G ≠ F. In prediction problems, strictly prope...
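The defining property of strict propriety can be checked numerically in a few lines. A minimal sketch using the Brier score in its loss orientation (where strict propriety means the expected loss is minimised only at the true probability): if the true class-1 probability is q, the expected loss of forecasting p is q(1−p)² + (1−q)p², and a grid search recovers the minimiser p = q.

```python
import numpy as np

q = 0.3                              # assumed true class-1 probability
grid = np.linspace(0, 1, 1001)       # candidate forecasts p
# expected Brier loss of forecasting p when Y ~ Bernoulli(q)
expected_loss = q * (1 - grid) ** 2 + (1 - q) * grid ** 2
best = grid[np.argmin(expected_loss)]
# best == q: the honest forecast uniquely minimises expected loss
```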
Decompositions of Proper Scores
Scoring rules are an important tool for evaluating the performance of probabilistic forecasts. A popular example is the Brier score, which allows for a decomposition into terms related to the sharpness (or information content) and to the reliability of the forecast. This feature renders the Brier score a very intuitive measure of forecast quality. In this paper, it is demonstrated that all stri...
The PAV algorithm optimizes binary proper scoring rules
There has been much recent interest in the application of the pool-adjacent-violators (PAV) algorithm for the purpose of calibrating the probabilistic outputs of automatic pattern recognition and machine learning algorithms. Special cost functions, known as proper scoring rules, form natural objective functions to judge the goodness of such calibration. We show that for binary pattern classifiers, t...
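The PAV algorithm referenced above fits a monotone (isotonic) map from raw scores to probabilities by repeatedly pooling adjacent blocks that violate monotonicity. A minimal sketch (an illustrative implementation, not the cited paper's code):

```python
import numpy as np

def pav_calibrate(scores, labels):
    """Pool-adjacent-violators: fit a non-decreasing map from
    classifier scores to probabilities on (score, label) pairs."""
    order = np.argsort(scores)
    y = np.asarray(labels, float)[order]
    # maintain blocks as (sum of labels, count); pool while any
    # adjacent pair of block means violates monotonicity
    vals, cnts = list(y), [1] * len(y)
    i = 0
    while i < len(vals) - 1:
        if vals[i] / cnts[i] > vals[i + 1] / cnts[i + 1]:
            vals[i] += vals.pop(i + 1)
            cnts[i] += cnts.pop(i + 1)
            if i > 0:
                i -= 1   # pooling may expose a new violation upstream
        else:
            i += 1
    fitted = np.repeat([v / c for v, c in zip(vals, cnts)], cnts)
    out = np.empty_like(fitted)
    out[order] = fitted   # restore the original example order
    return out

# the middle pair (label 1 before label 0) violates monotonicity
# and is pooled to its mean, 0.5
fitted = pav_calibrate([1, 2, 3, 4], [0, 1, 0, 1])
```

The pooled block means are the calibrated probabilities; the result the paper establishes is that, for binary classifiers, this single fit simultaneously optimizes every proper scoring rule on the training pairs.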